[fix](fe) Fix isCountStar incorrectly treating count(null) as count(*) by yujun777 · Pull Request #62548 · apache/doris

yujun777 · 2026-04-16T03:56:41Z

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Count.isCountStar() returns true for count(null) because it only checks child(0) instanceof Literal without excluding NullLiteral. Since count(null) always returns 0 (it counts non-null values of a constant null), it is semantically different from count(*) which counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g. PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll) use isCountStar() before CountLiteralRewrite has a chance to rewrite count(null) to 0.

Release note

Fixed count(null) being incorrectly treated as count(*) in query optimizer, which could lead to wrong results in certain rewrite scenarios.

Check List (For Author)

Test: Unit Test / Regression test
- FE UT: CountLiteralRewriteTest (3 tests pass, including new testCountNullIsNotCountStar)
- Regression test: count_null_not_count_star covering basic, group by, mixed, union all, window, and join paths
Behavior changed: No
Does this need documentation: No

Thearas · 2026-04-16T03:56:47Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

yujun777 · 2026-04-16T03:56:50Z

run buildall

yujun777 · 2026-04-16T03:56:51Z

/review

github-actions · 2026-04-16T03:58:48Z

PR approved by at least one committer and no changes requested.

github-actions · 2026-04-16T03:58:50Z

PR approved by anyone and no changes requested.

github-actions

Findings

regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new window case does not reach SimplifyWindowExpression, so it does not actually validate the isCountStar() consumer named in the PR.
regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new join case does not reach PushDownAggThroughJoin's count(*) rewrite branch, so it also does not validate the motivating join path described in the PR.
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/agg/Count.java: this helper also gates AggregateStrategies (COUNT_ON_INDEX / COUNT_ON_MATCH), but the new tests do not cover that implementation path yet.

Critical Checkpoints

Goal: The one-line change in Count.isCountStar() fixes the direct count(null) misclassification. The new unit test proves the helper behavior, but the added regression suite does not fully prove all claimed downstream consumers.
Scope: The change is small, focused, and centralized in the right helper.
Concurrency: No concurrency, locking, or lifecycle risk is involved here.
Config / compatibility: No new config, protocol, or storage compatibility impact.
Parallel code paths: isCountStar() is shared by multiple rewrite and implementation rules. The code updates them centrally, but coverage is still incomplete for the window, join, and count-on-index consumers.
Special conditions: The new null-literal exclusion is the right semantic distinction.
Tests: CountLiteralRewriteTest improves direct coverage, and the union regression is useful. However the new window and join regressions do not hit the named rewrite rules, and there is still no coverage for the COUNT_ON_INDEX fast path. I also could not run FE UT successfully in this runner because thirdparty/installed/bin/protoc is missing.
Result files: The new .out file is consistent with the SQL in the added suite.
Observability / transaction / persistence / FE-BE variable passing: Not applicable for this change.
Performance: No new performance concern found.
Other: I did not find another blocking correctness bug in the code change itself; the blocking issue is coverage quality for the affected consumers.

Requesting changes so the added tests exercise the actual rewrite / implementation paths that motivated this fix.

github-actions · 2026-04-16T04:26:28Z

+        ) t
+    """
+
+    // count(null) as window function


This window query does not actually exercise SimplifyWindowExpression. That rule only rewrites when the partition by keys are proven unique (window.getLogicalProperties().getTrait().isUnique(partitionSlotSet)), but this test has no partition keys, so it would skip the rule before and after the fix. Please add a unique-partition case (or a direct FE rule test) if the goal is to cover the isCountStar() consumer in SimplifyWindowExpression.

Fixed. Window test kept as a correctness check (count(null)=0 in window context). SimplifyWindowExpression requires provably unique partition keys which is hard to set up in regression tests — the core isCountStar() behavior is already covered by FE unit test testCountNullIsNotCountStar.

github-actions · 2026-04-16T04:26:28Z

+        select pk, count(null) over (order by pk) from count_null_test order by pk
+    """
+
+    // count(null) through join must still return 0, not get rewritten by PushDownAggThroughJoin


This join case does not reach PushDownAggThroughJoin's count(*) rewrite branch. With empty GROUP BY, PushDownAggThroughJoin#pushAgg returns null as soon as countStars is non-empty, so even the old buggy count(null) -> isCountStar() classification would not rewrite this query. To cover the motivating join scenario, the test needs a grouped join (or a plan test in PushDownCountThroughJoinTest) that actually enters the count(*) rewrite path.

Fixed. Replaced with a grouped join: select t1.pk, count(null), count(*) from ... group by t1.pk. Also added FE plan test testCountNullNotPushedDown in PushDownCountThroughJoinTest.java applying only PushDownAggThroughJoin to verify count(null) is NOT pushed down.

### What problem does this PR solve? Issue Number: close #xxx Problem Summary: Count.isCountStar() returns true for count(null) because it only checks `child(0) instanceof Literal` without excluding NullLiteral. Since count(null) always returns 0 (it counts non-null values of a constant null), it is semantically different from count(*) which counts all rows. This could cause incorrect results if downstream rewrite rules (e.g. PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll) use isCountStar() before CountLiteralRewrite has a chance to rewrite count(null) to 0. ### Release note Fixed count(null) being incorrectly treated as count(*) in query optimizer, which could lead to wrong results in certain rewrite scenarios. ### Check List (For Author) - Test: Unit Test / Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

yujun777 · 2026-04-16T09:46:35Z

run buildall

yujun777 · 2026-04-16T09:46:37Z

/review

github-actions

I agree with the semantic fix in Count.isCountStar() itself: excluding null literals is the right change, and the implementation is small and focused. The blocking issue is that the new coverage does not actually prove all of the downstream optimizer paths called out in the PR description.

count_null_union does not exercise PushCountIntoUnionAll. CountLiteralRewrite runs earlier in the rewrite pipeline and folds count(null) to constant 0 before the union-all pushdown stage sees it, so this query would still pass if that rule regressed.
count_null_window does not exercise SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique; the added query has no PARTITION BY, so a regression in checkCount() / isCountStar() there would go unnoticed.
The inverted-index case only checks the negative explain. If count(*) stopped using COUNT_ON_MATCH, the test would still pass because explainCountStar is never asserted.

Critical Checkpoints

Goal of current task: Prevent count(null) from being treated as count(*). The code change in Count.java does that locally, but the current tests do not fully prove the downstream paths named in the PR description.
Modification size/focus: Small and focused.
Concurrency: Not involved in the touched code.
Lifecycle/static initialization: Not involved.
Config changes: None.
Compatibility / FE-BE protocol / persistence: None.
Parallel code paths: Yes. Several optimizer paths depend on row-count semantics. The added coverage does not actually pin PushCountIntoUnionAll, does not hit SimplifyWindowExpression, and only partially checks the COUNT_ON_INDEX path.
Special conditions: The new null-literal guard is semantically appropriate.
Test coverage: Incomplete for the risky paths above.
Test result updates: The added .out file matches the SQL results that are asserted.
Observability: No change needed.
Transactionality / data writes / new FE-BE variables: Not applicable.
Performance: No concern from the code change itself.
Other issues: None beyond the missing coverage.

Please tighten the new tests so they actually exercise the optimizer branches this PR is claiming to fix.

github-actions · 2026-04-16T10:33:15Z

+    order_qt_count_mixed """select count(null), count(*), count(a), count(b) from count_null_test"""
+
+    // count(null) in subquery with union all
+    order_qt_count_null_union """


count(null) has already been rewritten to constant 0 by CountLiteralRewrite before the eager-aggregation stage reaches PushCountIntoUnionAll, so this query does not actually pin the union-all pushdown path mentioned in the PR description. It would still pass even if PushCountIntoUnionAll mishandled count(null). Please add a rule-level FE test or a plan assertion that exercises that rewrite directly.

Fixed. Added FE unit test testCountNullPushedAsCountNullNotCountStar in PushCountIntoUnionAllTest.java.

Note: CountLiteralRewrite does not rewrite count(null) to 0 in the no-GROUP-BY case. At line 66 of CountLiteralRewrite.java, it guards with aggFuncs.isEmpty() — when count(null) is the only aggregate and there are no GROUP BY keys, the resulting aggregate would have zero output expressions, so the rule returns null (no-op). PushCountIntoUnionAll therefore does see count(null) in this case.

The new FE unit test directly verifies: (1) PushCountIntoUnionAll fires (Sum0 appears in upper agg), and (2) child agg expressions do NOT contain a Count with isCountStar=true.

github-actions · 2026-04-16T10:33:15Z

+    """
+
+    // count(null) as window function should still return 0
+    order_qt_count_null_window """


This window case does not reach SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique, and an empty partition-key set is not treated as unique. With count(null) over (order by pk) the regression only checks result correctness, not the optimizer branch called out in the PR description. Please add a PARTITION BY case on a unique key such as pk so checkCount() / isCountStar() is exercised.

Fixed. Added a UNIQUE KEY table count_null_uniq_test with PARTITION BY pk so isUnique({pk}) returns true and SimplifyWindowExpression does fire. The test asserts count(null) over (partition by pk) = 0 for each row with explicit assertEquals(0L, row[1]) assertions. With the fix, isCountStar() returns false for count(null) → checkCount() returns false → window is NOT simplified → correctly returns 0.

github-actions · 2026-04-16T10:33:15Z

+
+    // count(*) with MATCH predicate may use COUNT_ON_MATCH optimization
+    // count(null) with same predicate must NOT — verify via explain
+    def explainCountStar = sql """


Only the negative explain is asserted here. If the positive optimization for count(*) stopped producing COUNT_ON_MATCH, this test would still pass because explainCountStar is unused. Please also assert that the count(*) explain contains COUNT_ON_MATCH or COUNT_ON_INDEX so this case proves the optimization path is actually exercised.

Fixed. Replaced the explain shape plan + assertFalse(contains 'COUNT_ON_MATCH') block — which was trivially always-true since explain shape plan calls shapeInfo() (returns class name only, never includes pushAggOp) — with proper explain {} DSL blocks:

count(*) asserts contains "pushAggOp=COUNT_ON_INDEX" (positive)

count(null) asserts notContains "pushAggOp=COUNT_ON_INDEX" (negative)

Note: PhysicalPlanTranslator maps COUNT_ON_MATCH → TPushAggOp.COUNT_ON_INDEX, so explain output shows pushAggOp=COUNT_ON_INDEX for both COUNT_ON_INDEX and COUNT_ON_MATCH physical paths. The positive assertion proves the optimization is actually exercised for count(*).

yujun777 · 2026-04-16T11:39:29Z

run feut

yujun777 · 2026-04-16T11:39:39Z

run vault_p0

hello-stephen · 2026-04-16T12:25:59Z

FE Regression Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

… optimizer paths ### What problem does this PR solve? Issue Number: close #xxx Problem Summary: Address three reviewer concerns on PR apache#62548: 1. PushCountIntoUnionAll: Add FE unit test testCountNullPushedAsCountNullNotCountStar verifying count(null) is pushed down as count(null) not as count(*). CountLiteralRewrite skips count(null) with no GROUP BY (would produce empty agg), so PushCountIntoUnionAll does see count(null). The new test explicitly pins this behavior. 2. SimplifyWindowExpression: Add regression test with PARTITION BY on a UNIQUE KEY table (count_null_uniq_test). The rule only fires when partition keys are proven unique; this test directly exercises checkCount() → isCountStar() for count(null). 3. COUNT_ON_INDEX: Replace explain shape plan (which only shows class name, not pushAggOp) with the explain{} DSL block. Now asserts count(*) DOES use pushAggOp=COUNT_ON_INDEX and count(null) does NOT, making the test non-trivial. ### Release note None ### Check List (For Author) - Test: Unit Test / Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

yujun777 · 2026-04-20T02:45:16Z

run buildall

yujun777 · 2026-04-20T02:45:18Z

/review

github-actions

No blocking issues found.

Critical Checkpoints

Goal: The change prevents count(null) from being treated as count(*) in shared Count.isCountStar() logic. The one-line fix is correct, and the added UT/regression coverage now exercises the direct helper behavior plus the union-all, join, window, and count-on-index consumers called out in the PR.
Scope: The modification is small, focused, and centralized in the right helper.
Concurrency: No concurrency, locking, or lifecycle risk is involved in the touched code.
Lifecycle / static init: No special lifecycle or static initialization impact.
Config: No new configuration or dynamic-config behavior.
Compatibility: No FE-BE protocol, persistence, or rolling-upgrade compatibility impact.
Parallel code paths: I checked the current isCountStar() consumers; the shared helper fix covers the relevant rewrite and implementation paths, and I did not find another reachable correctness gap for count(null).
Special conditions: Excluding null literals is the right semantic distinction. count(*) and non-null literal counts keep their existing behavior.
Test coverage: The new FE tests and regression suite are targeted and sufficient for this fix. The new .out file is consistent with the deterministic queries. I could not rerun FE UT in this runner because thirdparty/installed/bin/protoc is missing, so the verification here is based on code and plan-path inspection.
Observability: No additional logs or metrics are needed for this change.
Transactionality / data writes / FE-BE vars: Not applicable.
Performance: No new performance concern found.
Other: No additional blocking issues found.

abcxyz333 · 2026-04-20T03:18:20Z

run vault_p0

hello-stephen · 2026-04-20T05:20:25Z

FE Regression Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

abcxyz333 · 2026-04-20T06:26:50Z

run nonConcurrent

hello-stephen · 2026-04-20T08:22:12Z

FE Regression Coverage Report

Increment line coverage 0.00% (0/23) 🎉
Increment coverage report
Complete coverage report

fixed

github-actions · 2026-04-21T06:45:13Z

PR approved by at least one committer and no changes requested.

#62548) ### What problem does this PR solve? Problem Summary: `Count.isCountStar()` returns `true` for `count(null)` because it only checks `child(0) instanceof Literal` without excluding `NullLiteral`. Since `count(null)` always returns 0 (it counts non-null values of a constant null), it is semantically different from `count(*)` which counts all rows. This could cause incorrect results if downstream rewrite rules (e.g. `PushDownAggThroughJoin`, `SimplifyWindowExpression`, `PushCountIntoUnionAll`) use `isCountStar()` before `CountLiteralRewrite` has a chance to rewrite `count(null)` to `0`. ### Release note Fixed `count(null)` being incorrectly treated as `count(*)` in query optimizer, which could lead to wrong results in certain rewrite scenarios. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…) as count(*) #62548 (#62667) Cherry-picked from #62548 Co-authored-by: yujun <yujun@selectdb.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…) as count(*) #62548 (#62666) Cherry-picked from #62548 Co-authored-by: yujun <yujun@selectdb.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

morrySnow previously approved these changes Apr 16, 2026

View reviewed changes

morrySnow added dev/4.0.x dev/4.1.x labels Apr 16, 2026

github-actions Bot added the approved Indicates a PR has been approved by one committer. label Apr 16, 2026

github-actions Bot added the reviewed label Apr 16, 2026

github-actions Bot requested changes Apr 16, 2026

View reviewed changes

yujun777 dismissed morrySnow’s stale review via aed0d9c April 16, 2026 09:45

yujun777 force-pushed the fix/count-null-is-not-count-star branch from ca09b74 to aed0d9c Compare April 16, 2026 09:45

github-actions Bot removed the approved Indicates a PR has been approved by one committer. label Apr 16, 2026

github-actions Bot previously requested changes Apr 16, 2026

View reviewed changes

github-actions Bot reviewed Apr 20, 2026

View reviewed changes

morrySnow approved these changes Apr 21, 2026

View reviewed changes

github-actions Bot added the approved Indicates a PR has been approved by one committer. label Apr 21, 2026

morrySnow merged commit 9ba05cd into apache:master Apr 21, 2026
33 checks passed

github-actions Bot mentioned this pull request Apr 21, 2026

branch-4.0: [fix](fe) Fix isCountStar incorrectly treating count(null) as count(*) #62548 #62666

Merged

github-actions Bot mentioned this pull request Apr 21, 2026

branch-4.1: [fix](fe) Fix isCountStar incorrectly treating count(null) as count(*) #62548 #62667

Merged

yiguolei added dev/4.1.1-merged and removed dev/4.1.x labels Apr 22, 2026

yiguolei added dev/4.0.6-merged and removed dev/4.0.x labels Apr 22, 2026

Conversation

yujun777 commented Apr 16, 2026

What problem does this PR solve?

Release note

Check List (For Author)

Uh oh!

Thearas commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

github-actions Bot commented Apr 16, 2026

Uh oh!

github-actions Bot commented Apr 16, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

yujun777 commented Apr 16, 2026

Uh oh!

hello-stephen commented Apr 16, 2026

FE Regression Coverage Report

Uh oh!

yujun777 commented Apr 20, 2026

Uh oh!

yujun777 commented Apr 20, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

abcxyz333 commented Apr 20, 2026

Uh oh!

hello-stephen commented Apr 20, 2026

FE Regression Coverage Report

Uh oh!

abcxyz333 commented Apr 20, 2026

Uh oh!

hello-stephen commented Apr 20, 2026

FE Regression Coverage Report